In the world of multivariate extremes, estimation of the dependence structure still presents a 
challenge and an interesting problem. A procedure for the bivariate case is presented that opens 
the road to a similar way of handling the problem in a truly multivariate setting. We consider a 
semi-parametric model in which the stable tail dependence function is parametrically modeled. 
Given a random sample from a bivariate distribution function, the problem is to estimate the 
unknown parameter. A method of moments estimator is proposed where a certain integral of 
a nonparametric, rank-based estimator of the stable tail dependence function is matched with 
the corresponding parametric version. Under very weak conditions, the estimator is shown to 
be consistent and asymptotically normal. Moreover, a comparison between the parametric and 
nonparametric estimators leads to a goodness-of-fit test for the semiparametric model. The 
performance of the estimator is illustrated for a discrete spectral measure that arises in a factor-type 
model and for which likelihood-based methods break down. A second example is that of a 
family of stable tail dependence functions of certain meta-elliptical distributions. 

Keywords: asymptotic properties; confidence regions; goodness-of-fit test; meta-elliptical 
distribution; method of moments; multivariate extremes; tail dependence 

1. Introduction 

A bivariate distribution function $F$ with continuous marginal distribution functions $F_1$ 
and $F_2$ is said to have a stable tail dependence function $l$ if, for all $x \ge 0$ and $y \ge 0$, the 
following limit exists: 

$$\lim_{t \downarrow 0} t^{-1} P\{1 - F_1(X) \le tx \text{ or } 1 - F_2(Y) \le ty\} = l(x, y); \tag{1.1}$$

see [6, 15]. Here, $(X, Y)$ is a bivariate random vector with distribution $F$. 

The relevance of condition (1.1) comes from multivariate extreme value theory: if $F_1$ 
and $F_2$ are in the max-domains of attraction of extreme value distributions $G_1$ and 
$G_2$ and if (1.1) holds, then $F$ is in the max-domain of attraction of an extreme value 
distribution $G$ with marginals $G_1$ and $G_2$ and with copula determined by $l$; see Section 2 
for more details. 

Inference problems on multivariate extremes therefore generally separate into two 
parts. The first one concerns the marginal distributions and is simplified by the fact 
that univariate extreme value distributions constitute a parametric family. The second 
one concerns the dependence structure in the tail of F and forms the subject of this 
paper. In particular, we are interested in the estimation of the function $l$. The marginals 
will not be assumed to be known and will be estimated nonparametrically. As a consequence, 
the new inference procedures are rank-based and therefore invariant with respect 
to the marginal distribution, in accordance with (1.1). 

The class of stable tail dependence functions does not constitute a finite-dimensional 
family. This is an argument for nonparametric, model-free approaches. However, the ac- 
curacy of these nonparametric approaches is often poor in higher dimensions. Moreover, 
stable tail dependence functions satisfy a number of shape constraints (bounds, homo- 
geneity, convexity; see Section 2) which are typically not satisfied by nonparametric 
estimators. 

The other approach is the semiparametric one, that is, we model $l$ parametrically. 
At the price of an additional model risk, parametric methods yield estimates that are 
always proper stable tail dependence functions. Moreover, they do not suffer from the 
curse of dimensionality. A large number of models have been proposed in the literature, 
allowing for various degrees of dependence and asymmetry, and new models continue to 
be invented; see [1, 20] for an overview of the most common ones. 

In this paper, we propose an estimator based on the method of moments: given a 
parametric family $\{l_\theta : \theta \in \Theta\}$ with $\Theta \subset \mathbb{R}^p$ and a function $g : [0,1]^2 \to \mathbb{R}^p$, the moment 
estimator $\hat\theta_n$ is defined as the solution to the system of equations 

$$\iint_{[0,1]^2} g(x,y)\,\hat l_n(x,y)\,dx\,dy = \iint_{[0,1]^2} g(x,y)\,l_{\hat\theta_n}(x,y)\,dx\,dy.$$

Here, $\hat l_n$ is the nonparametric estimator of $l$. Moreover, a comparison of the parametric 
and nonparametric estimators yields a goodness-of-fit test for the postulated model. 

The method of moments estimator is to be contrasted with the maximum likelihood 
estimator in point process models for extremes [5, 17] or the censored likelihood approach 
proposed in [21, 23] and studied for single-parameter families in [14]. In parametric models, 
moment estimators yield consistent estimators, but often with a lower efficiency than 
the maximum likelihood estimator. However, as we shall see, the set of conditions required 
for the moment estimator is smaller, the conditions that remain to be imposed are much 
simpler and, most importantly, there are no restrictions whatsoever on the smoothness (or 
even on the existence) of the partial derivatives of $l$. Even for nonparametric estimators 
of $l$, asymptotic normality theorems require $l$ to be differentiable [6, 7, 15]. 

Such a degree of generality is needed if, for instance, the spectral measure underlying 
$l$ is discrete. In this case, there is no likelihood at all, so the maximum likelihood method 
breaks down. An example is the linear factor model $X = \beta F + \varepsilon$, where $X$ and $\varepsilon$ are 
$d \times 1$ random vectors, $F$ is an $m \times 1$ random vector of factor variables and $\beta$ is a constant 
$d \times m$ matrix of factor loadings. If the $m$ factor variables are mutually independent and if 
their common marginal tail is of Pareto type and heavier than those of the noise variables 
$\varepsilon_1, \ldots, \varepsilon_d$, then the spectral measure of the distribution of $X$ is discrete with point masses 
determined by $\beta$ and the tail index of the factor variables. The heuristic is that if $X$ is far 
from the origin, then with high probability, it will be dominated by a single component 
of $F$. Therefore, in the limit, there are only a finite number of directions for extreme 
outcomes of $X$. Section 5 deals with a two-factor model of the above type, which gives 
rise to a discrete spectral measure concentrated on only two atoms. For more examples 
of factor models and further references, see [11]. 

The paper is organized as follows. Basic properties of stable tail dependence functions 
and spectral measures are reviewed in Section 2. The estimator and goodness-of-fit test 
statistic are defined in Section 3. Section 4 states the main results on the large-sample 
properties of the new procedures. In Section 5, the example of a spectral measure with 
two atoms is worked out and the finite-sample performance of the moment estimator 
is evaluated via simulations; Section 6 carries out the same program for the stable tail 
dependence functions of elliptical distributions. All proofs are deferred to Section 7. 

2. Tail dependence 

Let $(X,Y), (X_1,Y_1), \ldots, (X_n,Y_n)$ be independent random vectors in $\mathbb{R}^2$ with common 
continuous distribution function $F$ and marginal distribution functions $F_1$ and $F_2$. The 
central assumption in this paper is the existence, for all $(x,y) \in [0,\infty)^2$, of the limit 
$l$ in (1.1). By the probability integral transform and the inclusion-exclusion 
formula, (1.1) is equivalent to the existence, for all $(x,y) \in [0,\infty]^2 \setminus \{(\infty,\infty)\}$, of the 
limit 

$$\lim_{t \downarrow 0} t^{-1} P\{1 - F_1(X) \le tx,\ 1 - F_2(Y) \le ty\} = R(x,y), \tag{2.1}$$

so $R(x,\infty) = R(\infty,x) = x$. The functions $l$ and $R$ are related by $R(x,y) = x + y - l(x,y)$ 
for $(x,y) \in [0,\infty)^2$. Note that $R(1,1)$ is the upper tail dependence coefficient. 

If $C$ denotes the copula of $F$, that is, if $F(x,y) = C\{F_1(x), F_2(y)\}$, then (1.1) is equivalent to 

$$\lim_{t \downarrow 0} t^{-1}\{1 - C(1 - tx, 1 - ty)\} = l(x,y) \tag{2.2}$$

for all $x, y \ge 0$ and also to 

$$\lim_{n \to \infty} C^n(u^{1/n}, v^{1/n}) = \exp\{-l(-\log u, -\log v)\} =: C_\infty(u,v)$$

for all $(u,v) \in (0,1]^2$. The left-hand side in the previous display is the copula of the pair 
of componentwise maxima $(\max_{i=1,\ldots,n} X_i, \max_{i=1,\ldots,n} Y_i)$ and the right-hand side is the 
copula of a bivariate max-stable distribution. If, in addition, the marginal distribution 
functions $F_1$ and $F_2$ are in the max-domains of attraction of extreme value distributions 
$G_1$ and $G_2$, that is, if there exist normalizing sequences $a_n > 0$, $c_n > 0$, $b_n \in \mathbb{R}$ and $d_n \in \mathbb{R}$ 
such that $F_1^n(a_n x + b_n) \xrightarrow{d} G_1(x)$ and $F_2^n(c_n y + d_n) \xrightarrow{d} G_2(y)$, then actually 

$$F^n(a_n x + b_n, c_n y + d_n) \xrightarrow{d} G(x,y) = C_\infty\{G_1(x), G_2(y)\},$$

that is, $F$ is in the max-domain of attraction of a bivariate extreme value distribution 
$G$ with marginals $G_1$ and $G_2$ and copula $C_\infty$. However, in this paper, we shall make no 
assumptions whatsoever on the marginal distributions $F_1$ and $F_2$, except for continuity. 

Directly from the definition of $l$, it follows that $x \vee y \le l(x,y) \le x + y$ for all $(x,y) \in [0,\infty)^2$. 
Similarly, $0 \le R(x,y) \le x \wedge y$ for $(x,y) \in [0,\infty)^2$. Moreover, the functions $l$ and 
$R$ are homogeneous of order one: for all $(x,y) \in [0,\infty)^2$ and all $t > 0$, 

$$l(tx,ty) = t\,l(x,y), \qquad R(tx,ty) = t\,R(x,y).$$

In addition, $l$ is convex and $R$ is concave. It can be shown that these requirements on 
$l$ (or, equivalently, $R$) are necessary and sufficient for $l$ to be a stable tail dependence 
function. 

The following representation will be extremely useful: there exists a finite Borel measure 
$H$ on $[0,1]$, called spectral or angular measure, such that for all $(x,y) \in [0,\infty)^2$, 

$$l(x,y) = \int_{[0,1]} \max\{wx, (1-w)y\}\,H(dw), \qquad R(x,y) = \int_{[0,1]} \min\{wx, (1-w)y\}\,H(dw). \tag{2.3}$$

The identities $l(x,0) = l(0,x) = x$ for all $x \ge 0$ imply the following moment constraints 
for $H$: 

$$\int_{[0,1]} w\,H(dw) = \int_{[0,1]} (1-w)\,H(dw) = 1. \tag{2.4}$$

Again, equation (2.4) constitutes a necessary and sufficient condition for $l$ in (2.3) to be 
a stable tail dependence function. For more details on multivariate extreme value theory, 
see, for instance, [1, 4, 8, 10, 13, 22]. 



3. Estimation and testing 

Let $R_i^X$ and $R_i^Y$ be the rank of $X_i$ among $X_1, \ldots, X_n$ and the rank of $Y_i$ among $Y_1, \ldots, Y_n$, 
respectively, where $i = 1, \ldots, n$. Replacing $P$, $F_1$ and $F_2$ on the left-hand side of (1.1) 
by their empirical counterparts, we obtain a nonparametric estimator for $l$. Estimators 
obtained in this way are 

$$\hat L_n^1(x,y) := \frac{1}{k} \sum_{i=1}^n 1\{R_i^X > n + 1 - kx \text{ or } R_i^Y > n + 1 - ky\},$$

$$\hat L_n^2(x,y) := \frac{1}{k} \sum_{i=1}^n 1\{R_i^X \ge n + 1 - kx \text{ or } R_i^Y \ge n + 1 - ky\},$$

defined in [7] and [6, 15], respectively (here, $k \in \{1, \ldots, n\}$). The estimator we will use 
here is similar to those above and is defined by 

$$\hat l_n(x,y) := \frac{1}{k} \sum_{i=1}^n 1\Bigl\{R_i^X > n + \tfrac{1}{2} - kx \text{ or } R_i^Y > n + \tfrac{1}{2} - ky\Bigr\}.$$

For finite samples, simulation experiments show that the latter estimator usually performs 
slightly better. The large-sample behaviors of the three estimators coincide, however, 
since $\hat L_n^1 \le \hat L_n^2 \le \hat l_n$ and, as $n \to \infty$, 

$$\sup_{0 \le x,y \le 1} \bigl|\sqrt{k}\,(\hat l_n(x,y) - \hat L_n^1(x,y))\bigr| \le \frac{2}{\sqrt{k}} \to 0, \tag{3.1}$$

where $k = k_n$ is an intermediate sequence, that is, $k \to \infty$ and $k/n \to 0$. 
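
The rank-based estimator with the $n + 1/2$ threshold can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `ranks` and `stdf_estimate` are hypothetical, and the data are assumed to contain no ties (as is the case almost surely for continuous margins).

```python
def ranks(values):
    """Return 1-based ranks of the entries; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order):
        result[index] = position + 1
    return result


def stdf_estimate(xs, ys, k, x, y):
    """Rank-based estimate of l(x, y) using the thresholds n + 1/2 - kx and n + 1/2 - ky."""
    n = len(xs)
    rank_x, rank_y = ranks(xs), ranks(ys)
    threshold_x = n + 0.5 - k * x
    threshold_y = n + 0.5 - k * y
    # Count observations that are extreme in at least one coordinate.
    count = sum(
        1
        for rx, ry in zip(rank_x, rank_y)
        if rx > threshold_x or ry > threshold_y
    )
    return count / k
```

For perfectly dependent data the estimate at $(1,1)$ equals 1, and for data whose large values never occur together it equals 2, matching the bounds $x \vee y \le l(x,y) \le x + y$.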

Assume that the stable tail dependence function $l$ belongs to some parametric family 
$\{l(\cdot,\cdot;\theta) : \theta \in \Theta\}$, where $\Theta \subset \mathbb{R}^p$, $p \ge 1$. (In the sequel, we will write $l(x,y;\theta)$ instead of 
$l_\theta(x,y)$.) Observe that this does not mean that $C$ (or $F$) belongs to a parametric family, 
that is, we have constructed a semiparametric model. Let $g : [0,1]^2 \to \mathbb{R}^p$ be an integrable 
function such that $\varphi : \Theta \to \mathbb{R}^p$ defined by 

$$\varphi(\theta) := \iint_{[0,1]^2} g(x,y)\,l(x,y;\theta)\,dx\,dy \tag{3.2}$$

is a homeomorphism between $\Theta^o$, the interior of the parameter space $\Theta$, and its image 
$\varphi(\Theta^o)$. For examples of the function $\varphi$, see Sections 5 and 6. Let $\theta_0$ denote the true 
parameter value and assume that $\theta_0 \in \Theta^o$. 

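The method of moments estimator $\hat\theta_n$ of $\theta_0$ is defined as the solution of 

$$\iint_{[0,1]^2} g(x,y)\,\hat l_n(x,y)\,dx\,dy = \iint_{[0,1]^2} g(x,y)\,l(x,y;\hat\theta_n)\,dx\,dy = \varphi(\hat\theta_n),$$

that is, 

$$\hat\theta_n = \varphi^{-1}\Bigl(\iint_{[0,1]^2} g(x,y)\,\hat l_n(x,y)\,dx\,dy\Bigr), \tag{3.3}$$

whenever the right-hand side is defined. For definiteness, if $\iint g\,\hat l_n \notin \varphi(\Theta^o)$, let $\hat\theta_n$ be some 
arbitrary, fixed value in $\Theta$. 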

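Consider the goodness-of-fit testing problem, $\mathcal{H}_0 : l \in \{l(\cdot,\cdot;\theta) : \theta \in \Theta\}$ against $\mathcal{H}_a : l \notin 
\{l(\cdot,\cdot;\theta) : \theta \in \Theta\}$. We propose the test statistic 

$$\iint_{[0,1]^2} \bigl(\hat l_n(x,y) - l(x,y;\hat\theta_n)\bigr)^2\,dx\,dy, \tag{3.4}$$

with $\hat\theta_n$ as in (3.3). The null hypothesis is rejected for large values of the test statistic. 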
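
As an illustration, the integral in (3.4) can be approximated by a midpoint rule on a regular grid. The sketch below is not taken from the paper: the function name `gof_statistic`, the grid size and the callables `l_hat` and `l_model` (standing for $\hat l_n$ and $l(\cdot,\cdot;\hat\theta_n)$) are assumptions made for the example.

```python
def gof_statistic(l_hat, l_model, grid=50):
    """Midpoint-rule approximation of the integral of (l_hat - l_model)^2 over [0, 1]^2."""
    h = 1.0 / grid
    total = 0.0
    for i in range(grid):
        x = (i + 0.5) * h
        for j in range(grid):
            y = (j + 0.5) * h
            diff = l_hat(x, y) - l_model(x, y)
            total += diff * diff
    # Each grid cell has area h * h.
    return total * h * h
```

The statistic returned here is unscaled; the limit theorem for the test (Theorem 4.4) concerns $k$ times this quantity.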
4. Results 

The method of moments estimator is consistent for every intermediate sequence $k = k_n$ 
under minimal conditions on the model and the function $g$. 

Theorem 4.1 (Consistency). Let $g : [0,1]^2 \to \mathbb{R}^p$ be integrable. If $\varphi$ in (3.2) is a 
homeomorphism between $\Theta^o$ and $\varphi(\Theta^o)$ and if $\theta_0 \in \Theta^o$, then as $n \to \infty$, $k \to \infty$ and $k/n \to 0$, 
the right-hand side of (3.3) is well defined with probability tending to 1 and $\hat\theta_n \xrightarrow{P} \theta_0$. 

Denote by $W$ a mean-zero Wiener process on $[0,\infty]^2 \setminus \{(\infty,\infty)\}$ with covariance function 

$$E\,W(x_1,y_1)W(x_2,y_2) = R(x_1 \wedge x_2, y_1 \wedge y_2)$$

and for $x, y \in [0,\infty)$, let 

$$W_1(x) := W(x,\infty), \qquad W_2(y) := W(\infty,y).$$

Further, for $(x,y) \in [0,\infty)^2$, let $R_1(x,y)$ and $R_2(x,y)$ be the right-hand partial derivatives 
of $R$ at the point $(x,y)$ with respect to the first and second coordinate, respectively. Since 
$R$ is concave, $R_1$ and $R_2$ defined in this way always exist, although they are discontinuous 
at points where $\frac{\partial}{\partial x}R(x,y)$ or $\frac{\partial}{\partial y}R(x,y)$ do not exist. 

Finally, define the stochastic process $B$ on $[0,\infty)^2$ and the $p$-variate random vector $\tilde B$ by 

$$B(x,y) = W(x,y) - R_1(x,y)W_1(x) - R_2(x,y)W_2(y),$$

$$\tilde B = \iint_{[0,1]^2} g(x,y)\,B(x,y)\,dx\,dy.$$


Theorem 4.2 (Asymptotic normality). In addition to the conditions in Theorem 4.1, 
assume the following: 

(C1) the function $\varphi$ is continuously differentiable in some neighborhood of $\theta_0$ and its 
derivative matrix $D_\varphi(\theta_0)$ is invertible; 

(C2) there exists $\alpha > 0$ such that as $t \downarrow 0$, 

$$t^{-1} P\{1 - F_1(X) \le tx,\ 1 - F_2(Y) \le ty\} - R(x,y) = O(t^\alpha),$$

uniformly on the set $\{(x,y) : x + y = 1,\ x \ge 0,\ y \ge 0\}$; 

(C3) $k = k_n \to \infty$ and $k = o(n^{2\alpha/(1+2\alpha)})$ as $n \to \infty$. 

Then 

$$\sqrt{k}\,(\hat\theta_n - \theta_0) \xrightarrow{d} D_\varphi(\theta_0)^{-1}\tilde B. \tag{4.1}$$

Note that condition (C2) is a second-order condition quantifying the speed of convergence 
in (2.1). Condition (C3) gives an upper bound on the speed with which $k$ can grow 
to infinity. This upper bound is related to the speed of convergence in (C2) and ensures 
that $\hat\theta_n$ is asymptotically unbiased. 

The limiting distribution in (4.1) depends on the model and on the auxiliary function 
g. The optimal g would be the one minimizing the asymptotic variance, but this mini- 
mization problem is typically difficult to solve. In the examples in Sections 5 and 6, the 
functions g were chosen so as to simplify the calculations. 

From the definition of the process $B$, it follows that the distribution of $\tilde B$ is $p$-variate 
normal with mean zero and covariance matrix 

$$\Sigma(\theta_0) = \mathrm{Var}(\tilde B) = \iiiint_{[0,1]^4} g(x,y)\,g(u,v)^T \sigma(x,y,u,v;\theta_0)\,dx\,dy\,du\,dv, \tag{4.2}$$

where $\sigma$ is the covariance function of the process $B$, that is, for $\theta \in \Theta$, 

$$\begin{aligned}
\sigma(x,y,u,v;\theta) &= E\,B(x,y)B(u,v) \\
&= R(x \wedge u, y \wedge v;\theta) + R_1(x,y;\theta)R_1(u,v;\theta)(x \wedge u) \\
&\quad + R_2(x,y;\theta)R_2(u,v;\theta)(y \wedge v) - 2R_1(u,v;\theta)R(x \wedge u, y;\theta) \\
&\quad - 2R_2(u,v;\theta)R(x, y \wedge v;\theta) + 2R_1(x,y;\theta)R_2(u,v;\theta)R(x,v;\theta).
\end{aligned} \tag{4.3}$$

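Denote by $H_\theta$ the spectral measure corresponding to $l(\cdot,\cdot;\theta)$. The following corollary 
allows the construction of confidence regions. 

Corollary 4.3. Under the assumptions of Theorem 4.2, if the map $\theta \mapsto H_\theta$ is weakly 
continuous at $\theta_0$ and if $\Sigma(\theta_0)$ is non-singular, then, as $n \to \infty$, 

$$k\,(\hat\theta_n - \theta_0)^T D_\varphi(\hat\theta_n)^T \Sigma(\hat\theta_n)^{-1} D_\varphi(\hat\theta_n)(\hat\theta_n - \theta_0) \xrightarrow{d} \chi_p^2.$$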
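
For a scalar parameter ($p = 1$), the quadratic form in Corollary 4.3 reduces to $k(\hat\theta_n - \theta_0)^2 D_\varphi(\hat\theta_n)^2 / \Sigma(\hat\theta_n)$, which can be inverted into an asymptotic interval. The sketch below assumes that numerical values of $D_\varphi(\hat\theta_n)$ and $\Sigma(\hat\theta_n)$ are supplied by the user; the function name `confidence_interval` is hypothetical and the default quantile is the 0.95 quantile of the $\chi_1^2$ distribution.

```python
import math


def confidence_interval(theta_hat, k, d_phi, sigma2, chi2_quantile=3.841458820694124):
    """Asymptotic interval for a scalar parameter based on the chi-square limit.

    theta_hat: moment estimate; k: number of upper order statistics used;
    d_phi: derivative of phi at theta_hat; sigma2: variance Sigma(theta_hat).
    """
    # Solve k * (theta_hat - theta)^2 * d_phi^2 / sigma2 <= chi2_quantile for theta.
    half_width = math.sqrt(chi2_quantile * sigma2 / k) / abs(d_phi)
    return theta_hat - half_width, theta_hat + half_width
```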
Finally, we derive the limit distribution of the test statistic in (3.4). 

Theorem 4.4 (Test). Assume that the null hypothesis $\mathcal{H}_0$ holds and let $\theta_{H_0}$ denote the 
true parameter. If 

(1) for all $\theta_0 \in \Theta$ the conditions of Theorem 4.2 are satisfied (and hence $\Theta$ is open); 

(2) on $\Theta$, the mapping $\theta \mapsto l(x,y;\theta)$ is differentiable for all $(x,y) \in [0,1]^2$ and its 
gradient is bounded in $(x,y) \in [0,1]^2$, 

then 

$$\iint_{[0,1]^2} k\,\bigl(\hat l_n(x,y) - l(x,y;\hat\theta_n)\bigr)^2\,dx\,dy \xrightarrow{d} \iint_{[0,1]^2} \bigl(B(x,y) - D_{l(x,y;\theta)}(\theta_{H_0})\,D_\varphi(\theta_{H_0})^{-1}\tilde B\bigr)^2\,dx\,dy$$

as $n \to \infty$, where $D_{l(x,y;\theta)}(\theta_{H_0})$ is the gradient of $\theta \mapsto l(x,y;\theta)$ at $\theta_{H_0}$. 

5. Example 1: Two-point spectral measure 

The two-point spectral measure is a spectral measure $H$ that is concentrated on only two 
points in $(0,1) \setminus \{1/2\}$; call them $a$ and $1 - b$. The moment conditions (2.4) imply that 
one of those points is less than 1/2 and the other one is greater than 1/2, and the masses 
on those points are determined by their locations. For definiteness, let $a \in (0,1/2)$ and 
$1 - b \in (1/2,1)$, so the parameter vector $\theta = (a,b)$ takes values in the square $\Theta = (0,1/2)^2$. 
The masses assigned to $a$ and $1 - b$ are 

$$q := H(\{a\}) = \frac{1 - 2b}{1 - a - b} \quad \text{and} \quad 2 - q = H(\{1 - b\}) = \frac{1 - 2a}{1 - a - b}.$$

This model is also known as the natural model and was first described by Tiago de 
Oliveira [24, 25]. 

By (2.3), the corresponding stable tail dependence function is 

$$l(x,y;a,b) = q \max\{ax, (1-a)y\} + (2-q)\max\{(1-b)x, by\}.$$
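
The formula above is easy to evaluate directly. The following sketch (with hypothetical function names `two_point_masses` and `two_point_stdf`) computes the masses implied by the moment constraints (2.4) and the resulting $l(x,y;a,b)$, so that properties such as $l(x,0) = l(0,x) = x$ and homogeneity can be checked numerically.

```python
def two_point_masses(a, b):
    """Masses q at the atom a and 2 - q at the atom 1 - b, as forced by (2.4)."""
    q = (1.0 - 2.0 * b) / (1.0 - a - b)
    return q, 2.0 - q


def two_point_stdf(x, y, a, b):
    """Stable tail dependence function of the two-point spectral measure."""
    q, r = two_point_masses(a, b)
    return q * max(a * x, (1.0 - a) * y) + r * max((1.0 - b) * x, b * y)
```

For example, with $a = 0.125$ and $b = 0.375$ the masses are $q = 0.5$ and $2 - q = 1.5$, and $l(1,1;a,b) = 1.375$, which lies between $1 \vee 1 = 1$ and $1 + 1 = 2$.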

The partial derivatives of $l$ with respect to $x$ and $y$ are 

$$\frac{\partial l(x,y;a,b)}{\partial x} = \begin{cases} 1, & \text{if } y < \dfrac{a}{1-a}\,x, \\[1ex] (1-b)(2-q), & \text{if } \dfrac{a}{1-a}\,x < y < \dfrac{1-b}{b}\,x, \\[1ex] 0, & \text{if } y > \dfrac{1-b}{b}\,x, \end{cases}$$

and, by symmetry, $\partial l(x,y;a,b)/\partial y$ equals $\partial l/\partial x$ evaluated at $(y,x;b,a)$. Note that the partial derivatives do not exist 
on the lines $y = \frac{a}{1-a}x$ and $y = \frac{1-b}{b}x$. The same is true for the partial derivatives of $R$. 
As a consequence, the maximum likelihood method is not applicable and the asymptotic 
normality of the nonparametric estimator breaks down. However, the method of moments 
estimator can still be used since, in Theorem 4.2, no smoothness assumptions whatsoever 
are made on $l$. 

As explained in the Introduction, discrete spectral measures arise whenever extremes 
are determined by a finite number of independent, heavy-tailed factors. Specifically, let 
the random vector $(X,Y)$ be given by 

$$(X,Y) = (\alpha Z_1 + (1-\alpha)Z_2 + \varepsilon_1,\ (1-\beta)Z_1 + \beta Z_2 + \varepsilon_2), \tag{5.1}$$

where $0 < \alpha < 1$ and $0 < \beta < 1$ are coefficients and where $Z_1$, $Z_2$, $\varepsilon_1$ and $\varepsilon_2$ are 
independent random variables satisfying the following conditions: there exist $\nu > 0$ and 
a slowly varying function $L$ such that $P(Z_i > z) = z^{-\nu}L(z)$ for $i = 1, 2$; 
$P(\varepsilon_j > z)/P(Z_i > z) \to 0$ as $z \to \infty$, $j = 1, 2$. (Recall that a positive, measurable function 
$L$ defined in a neighborhood of infinity is called slowly varying if $L(yz)/L(z) \to 1$ as 
$z \to \infty$ for all $y > 0$.) Straightforward, but lengthy, computations show that the spectral 
measure of the random vector $(X,Y)$ is a two-point spectral measure having masses $q$ 
and $2 - q$ at the points $a$ and $1 - b$, where 



a'' + (l-a)'' ' a'^-l-(l-a) 

Write $A = \{(x,y) \in [0,1]^2 : x + y \le 1\}$ and let $1_A$ be its indicator function. The function 
$g_A : [0,1]^2 \to \mathbb{R}^2$ defined by $g_A(x,y) = 1_A(x,y)(x,y)^T$ is obviously integrable and the 
function $\varphi$ in (3.2) is given by 

$$\varphi(a,b) = \iint_A (x,y)^T\,l(x,y;a,b)\,dx\,dy = (J(a,b), K(a,b))^T,$$

where $K(a,b) = J(b,a)$ and 

$$J(a,b) = \tfrac{1}{24}\{(2ab - a - b)(b - a + 1) + a(b - 1) + 3\}.$$
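
The closed form of $J$ can be checked against a direct numerical integration of $x\,l(x,y;a,b)$ over the triangle $A$. The sketch below is an illustration only; the function names and the grid size are assumptions, and a simple midpoint rule is used in place of any more refined quadrature.

```python
def stdf(x, y, a, b):
    """Two-point stable tail dependence function l(x, y; a, b)."""
    q = (1.0 - 2.0 * b) / (1.0 - a - b)
    return q * max(a * x, (1.0 - a) * y) + (2.0 - q) * max((1.0 - b) * x, b * y)


def J_closed(a, b):
    """Closed-form expression for J(a, b)."""
    return ((2 * a * b - a - b) * (b - a + 1) + a * (b - 1) + 3) / 24.0


def J_numeric(a, b, grid=200):
    """Midpoint rule for the integral of x * l(x, y; a, b) over A = {x + y <= 1}."""
    total = 0.0
    hx = 1.0 / grid
    for i in range(grid):
        x = (i + 0.5) * hx
        hy = (1.0 - x) / grid  # inner rule covers 0 <= y <= 1 - x exactly
        inner = sum(stdf(x, (j + 0.5) * hy, a, b) for j in range(grid))
        total += x * inner * hy * hx
    return total
```

By the relation $K(a,b) = J(b,a)$, the same routine evaluated at $(b,a)$ gives the second component of $\varphi$.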
Nonparametric estimators of $J$ and $K$ are given by 

$$(\hat J_n, \hat K_n)^T = \iint_A (x,y)^T\,\hat l_n(x,y)\,dx\,dy$$

and the method of moments estimators $(\hat a_n, \hat b_n)$ are defined as the solutions to the equations 

$$(\hat J_n, \hat K_n) = (J(\hat a_n,\hat b_n), K(\hat a_n,\hat b_n)).$$

Due to the explicit nature of the functions $J$ and $K$, these equations can be simplified: 
if we denote $c_{J,n} := 3(8\hat J_n - 1)$ and $c_{K,n} := 3(8\hat K_n - 1)$, the estimator $\hat b_n$ of $b$ will be a 
solution of the quadratic equation 

$$3(2c_{J,n} + 2c_{K,n} + 3)b^2 + 3(-5c_{J,n} + c_{K,n} - 3)b + 3c_{J,n} - 6c_{K,n} - (c_{J,n} + c_{K,n})^2 = 0$$

that falls into the interval $(0,1/2)$ and the estimator of $a$ is 

$$\hat a_n = \frac{3\hat b_n + c_{J,n} + c_{K,n}}{6\hat b_n - 3}.$$
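
The inversion described above can be sketched as follows. The function names are hypothetical, and returning `None` when no root of the quadratic falls in $(0,1/2)$ is a choice made for this example rather than the convention of the paper (which assigns an arbitrary fixed value in that case).

```python
import math


def J(a, b):
    """Closed-form first component of phi(a, b)."""
    return ((2 * a * b - a - b) * (b - a + 1) + a * (b - 1) + 3) / 24.0


def K(a, b):
    """Second component of phi(a, b), using K(a, b) = J(b, a)."""
    return J(b, a)


def solve_two_point(j_n, k_n):
    """Solve (j_n, k_n) = (J(a, b), K(a, b)) for (a, b) via the quadratic in b."""
    c_j = 3.0 * (8.0 * j_n - 1.0)
    c_k = 3.0 * (8.0 * k_n - 1.0)
    s = c_j + c_k
    qa = 3.0 * (2.0 * s + 3.0)
    qb = 3.0 * (-5.0 * c_j + c_k - 3.0)
    qc = 3.0 * c_j - 6.0 * c_k - s * s
    disc = qb * qb - 4.0 * qa * qc
    if qa == 0.0 or disc < 0.0:
        return None  # degenerate or no real root
    for sign in (-1.0, 1.0):
        b = (-qb + sign * math.sqrt(disc)) / (2.0 * qa)
        if 0.0 < b < 0.5:
            a = (3.0 * b + s) / (6.0 * b - 3.0)
            return a, b
    return None
```

Feeding the exact moments $J(a,b)$ and $K(a,b)$ into `solve_two_point` recovers $(a,b)$, which provides a quick consistency check of the quadratic equation against the closed form of $J$.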

In the simulations, we used the following models: 

(i) $Z_1, Z_2 \sim$ Fréchet(1), so $\nu = 1$, and $\varepsilon_1, \varepsilon_2 \sim N(0,1)$ (Figures 1, 2, 3); 

(ii) Zi,Z2-i2, so i/ = l/2, andei,e2~A^(0,0.52) (Figures 4, 5, 6). 
Figure 1. Model (5.1) with $Z_1, Z_2 \sim$ Fréchet(1), $\varepsilon_1, \varepsilon_2 \sim N(0,1)$, $a_0 = b_0 = 0.001$. (a) Bias of estimator of $a$; (b) RMSE of estimator of $a$. 



The figures show the bias and the root mean squared error (RMSE) of $\hat a_n$ and $\hat b_n$ for 1000 
samples of size n = 1000. The method of moments estimator performs well in general. 
We see a very good behavior when ao = 60 ~ 0. Of course, the heavier the tail of Zi, the 
better the performance of the estimator. 




Figure 2. Model (5.1) with $Z_1, Z_2 \sim$ Fréchet(1), $\varepsilon_1, \varepsilon_2 \sim N(0,1)$, $a_0 = b_0 = 0.3125$. (a) Bias of estimator of $a$; (b) RMSE of estimator of $a$. 




6. Example 2: Parallel meta-elliptical model 

A random vector $(X,Y)$ is said to be elliptically distributed if it satisfies the distributional 
equality 

$$(X,Y)^T \stackrel{d}{=} \mu + ZAU, \tag{6.1}$$

where $\mu$ is a $2 \times 1$ column vector, $Z$ is a positive random variable called the generating 
random variable, $A$ is a $2 \times 2$ matrix such that $\Sigma = AA^T$ is of full rank and $U$ is a 
two-dimensional random vector independent of $Z$ and uniformly distributed on the unit 
circle $\{(x,y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$. Under the above assumptions, the matrix $\Sigma$ can be 
written as 

$$\Sigma = \begin{pmatrix} \sigma^2 & \rho\sigma v \\ \rho\sigma v & v^2 \end{pmatrix}, \tag{6.2}$$

where $\sigma > 0$, $v > 0$ and $-1 < \rho < 1$. The special case $\rho = 0$ yields the subclass of parallel 
elliptical distributions. 

Figure 4. Model (5.1) with $Z_1, Z_2 \sim t_2$, $\varepsilon_1, \varepsilon_2 \sim N(0, 0.5^2)$, $a_0 = b_0 = 0.001$. (a) Bias of estimator of $a$; (b) RMSE of estimator of $a$. 



Figure 6. Model (5.1) with $Z_1, Z_2 \sim t_2$, $\varepsilon_1, \varepsilon_2 \sim N(0, 0.5^2)$, $a_0 = 0.125$, $b_0 = 0.375$. (a) Bias of estimators of $a$ and $b$; (b) RMSE of estimators of $a$ and $b$. 
By [16], the distribution of $Z$ satisfies $P(Z > z) = z^{-\nu}L(z)$ with $\nu > 0$ and $L$ slowly 
varying if and only if the distribution of $(X,Y)$ is (multivariate) regularly varying with 
the same index. Under this assumption, the function $R$ of the distribution of $(X,Y)$ was 
derived in [18]. In case $\rho = 0$, the formula specializes to 

$$R(x,y;\nu) = \frac{x \int_{f(x,y;\nu)}^{\pi/2} (\cos\phi)^\nu\,d\phi + y \int_0^{f(x,y;\nu)} (\sin\phi)^\nu\,d\phi}{\int_{-\pi/2}^{\pi/2} (\cos\phi)^\nu\,d\phi} \tag{6.3}$$

with $f(x,y;\nu) = \arctan\{(x/y)^{1/\nu}\}$. Hence, the class of stable tail dependence functions 
belonging to parallel elliptical vectors with regularly varying generating random variables 
forms a one-dimensional parametric family indexed by the index of regular variation 
$\nu \in (0,\infty) = \Theta$ of $Z$. We will call the corresponding stable tail dependence functions $l$ 
parallel elliptical. 
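
Formula (6.3) can be evaluated numerically with elementary quadrature. The sketch below uses a plain midpoint rule; the function names `_midpoint` and `R_parallel_elliptical` and the number of quadrature steps are assumptions made for this illustration, and both arguments $x$ and $y$ are assumed to be strictly positive.

```python
import math


def _midpoint(func, lower, upper, steps=2000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    if upper <= lower:
        return 0.0
    h = (upper - lower) / steps
    return h * sum(func(lower + (i + 0.5) * h) for i in range(steps))


def R_parallel_elliptical(x, y, nu):
    """Numerical evaluation of R(x, y; nu) in (6.3); assumes x > 0, y > 0, nu > 0."""
    f = math.atan((x / y) ** (1.0 / nu))
    numerator = (
        x * _midpoint(lambda t: math.cos(t) ** nu, f, math.pi / 2.0)
        + y * _midpoint(lambda t: math.sin(t) ** nu, 0.0, f)
    )
    denominator = _midpoint(lambda t: math.cos(t) ** nu, -math.pi / 2.0, math.pi / 2.0)
    return numerator / denominator
```

For $\nu = 1$ the integrals can be computed by hand and give $R(1,1;1) = 1 - \sqrt{2}/2 \approx 0.293$, which the routine reproduces; the corresponding stable tail dependence function is obtained as $l(x,y;\nu) = x + y - R(x,y;\nu)$.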

In [9], meta-elliptical distributions are defined as the distributions of random vectors 
of the form $(s(X), t(Y))$, where the distribution of $(X,Y)$ is elliptical and $s$ and $t$ are 
increasing functions. In other words, a distribution is meta-elliptical if and only if its 
copula is that of an elliptical distribution. Such copulas are called meta-elliptical in [12] 
(note that a copula, as a distribution function on the unit square, cannot be elliptical in 
the sense of (6.1)). Since a stable tail dependence function $l$ of a bivariate distribution 
$F$ is only determined by $F$ through its copula $C$ (see (2.2)), the results in the preceding 
paragraph continue to hold for meta-elliptical distributions. In the case $\rho = 0$, we will 
speak of parallel meta-elliptical distributions. In the case where the generating random 
variable $Z$ is regularly varying with index $\nu$, the function $R$ is given by (6.3). 

For parallel meta-elliptical distributions, the second-order condition (C2) in Theorem 4.2 
can be checked via second-order regular variation of $Z$. 

Lemma 6.1. Let $F$ be a parallel meta-elliptical distribution with generating random 
variable $Z$. If there exist $\nu > 0$, $\beta < 0$ and a function $A(t)$ of constant sign near 
infinity such that 

$$\lim_{t \to \infty} \frac{P(Z > tx)/P(Z > t) - x^{-\nu}}{A(t)} = x^{-\nu}\,\frac{x^\beta - 1}{\beta}, \tag{6.4}$$

then condition (C2) in Theorem 4.2 holds for every $\alpha \in (0, -\beta/\nu)$. 

Note that although the generating random variable is only defined up to a multiplicative 
constant, condition (6.4) does make sense: that is, if (6.4) holds for a random 
variable $Z$, then it also holds for $cZ$ with $c > 0$, for the same constants $\nu$ and $\beta$ and 
for the rate function $A^*(t) := A(t/c)$. Note that $|A|$ is necessarily regularly varying with 
index $\beta$; see [2], equation (3.0.3). 

Now, assume that $(X_1,Y_1), \ldots, (X_n,Y_n)$ is a random sample from a bivariate distribution 
$F$ with parallel elliptical stable tail dependence function $l$, that is, $l \in \{l(\cdot,\cdot;\nu) : \nu \in 
(0,\infty)\}$, where $l(x,y;\nu) = x + y - R(x,y;\nu)$ and $R(x,y;\nu)$ is as in (6.3). We will apply the 
method of moments to estimate the parameter $\nu$. Since $l$ is defined by a limit relation, our 
assumption on $F$ is weaker than the assumption that $F$ is parallel meta-elliptical with 
regularly varying $Z$, which, as explained above, is, in turn, weaker than the assumption 
that $F$ itself is parallel elliptical with regularly varying $Z$. The problem of estimating 
$R$ was addressed in [18] for elliptical distributions and in [19] for meta-elliptical 
distributions. 

Figure 7. Estimation of $\nu = 1$ in the bivariate Cauchy model. (a) Bias of estimator of $\nu$; (b) RMSE of estimator of $\nu$. 

Figure 8. Estimation of $\nu = 1$ in the model $(X_1,Y_1)^T = ZU$, where $Z$ is Fréchet(1). (a) Bias of estimator of $\nu$; (b) RMSE of estimator of $\nu$. 

Figure 9. Estimation of $\nu = 5$ in the model $(X_1,Y_1)^T = ZU$, where $Z$ is Fréchet(5). (a) Bias of estimator of $\nu$; (b) RMSE of estimator of $\nu$. 

Figure 10. Estimation of $R(1,1;1)$ in the bivariate Cauchy model. (a) The mean over 1000 replications of the estimate of $R(1,1;1)$; (b) the RMSE of the estimator of $R(1,1;1)$. 

We simulated 1000 random samples of size $n = 1000$ from models for which the assumptions 
of Theorem 4.2 hold and which have the function $R(\cdot,\cdot;\nu)$ as in (6.3), with 
$\nu \in \{1,5\}$. The three models we used are of the type $(X_1,Y_1)^T = ZU$. In the first model, 
the generating random variable $Z$ is such that $P(Z > z) = (1 + z^2)^{-1/2}$ for $z \ge 0$, that is, 
the first model is the bivariate Cauchy ($\nu = 1$). In the other two models, $Z$ is Fréchet($\nu$) 
with $\nu \in \{1,5\}$. 

Figures 7 to 9 show the bias and the RMSE of the moment estimator of $\nu$. 
The auxiliary function $g : [0,1]^2 \to \mathbb{R}$ is $g(x,y) = 1(x + y \le 1)$. For comparison, 
Figures 10 and 11 show the plots of the means and RMSE of the parametric and 
nonparametric estimates $R(1,1;\hat\nu_n)$ and $\hat R_n(1,1) = 2 - \hat l_n(1,1)$ of the upper tail 
dependence coefficient $R(1,1)$. We can see that the method of moments estimator 
of the upper tail dependence coefficient $R(1,1;\nu)$ performs well. In particular, 
it is much less sensitive to the choice of $k$ than the nonparametric estimator. 

7. Proofs 



Proof of Theorem 4.1. First, note that 

$$\Bigl|\iint_{[0,1]^2} g(x,y)\,\hat l_n(x,y)\,dx\,dy - \iint_{[0,1]^2} g(x,y)\,l(x,y;\theta_0)\,dx\,dy\Bigr| \le \sup_{0 \le x,y \le 1} |\hat l_n(x,y) - l(x,y;\theta_0)| \iint_{[0,1]^2} |g(x,y)|\,dx\,dy.$$

The second term is finite by assumption and 

$$\sup_{0 \le x,y \le 1} |\hat l_n(x,y) - l(x,y;\theta_0)| \xrightarrow{P} 0$$

by (3.1) and [15], Theorem 1; see also [6]. Therefore, as $n \to \infty$, 

$$\iint_{[0,1]^2} g(x,y)\,\hat l_n(x,y)\,dx\,dy \xrightarrow{P} \iint_{[0,1]^2} g(x,y)\,l(x,y;\theta_0)\,dx\,dy = \varphi(\theta_0).$$

Since $\varphi(\theta_0) \in \varphi(\Theta^o)$, which is open, and since $\varphi^{-1}$ is continuous at $\varphi(\theta_0)$ by assumption, 
we can apply the function $\varphi^{-1}$ on both sides of the previous limit relation so that, by 
the continuous mapping theorem, we indeed have $\hat\theta_n \xrightarrow{P} \theta_0$. □ 

For the proof of Theorem 4.2, we will need the following lemma, the proof of which 
follows from [8], Lemma 6.2.1. 

Lemma 7.1. The function $R$ in (2.3) is differentiable at $(x,y) \in (0,\infty)^2$ if $H(\{z\}) = 0$ 
with $z = y/(x+y)$. In that case, the gradient of $R$ is given by $(R_1(x,y), R_2(x,y))^T$, where 

$$R_1(x,y) = \int_0^z w\,H(dw), \qquad R_2(x,y) = \int_z^1 (1-w)\,H(dw). \tag{7.1}$$




Figure 11. Estimation of $R(1,1;5)$ in the model $(X_1,Y_1)^T = ZU$, where $Z$ is Fréchet(5). (a) The mean over 1000 replications of the estimate of $R(1,1;5)$; (b) the RMSE of the estimator of $R(1,1;5)$. 
For $i = 1, \ldots, n$, let $U_i := 1 - F_1(X_i)$ and $V_i := 1 - F_2(Y_i)$. Let $Q_{1n}$ and $Q_{2n}$ denote the 
empirical quantile functions of $(U_1, \ldots, U_n)$ and $(V_1, \ldots, V_n)$, respectively, that is, 

$$Q_{1n}(y) = U_{\lceil ny \rceil : n}, \qquad Q_{2n}(y) = V_{\lceil ny \rceil : n}, \qquad 0 < y \le 1,$$

where $U_{1:n} \le \cdots \le U_{n:n}$ and $V_{1:n} \le \cdots \le V_{n:n}$ are the order statistics and where $\lceil a \rceil$ is 
the smallest integer not smaller than $a$. Define 

and 

1 " f ;„ k •] 

rC I 71 77-1 

z— 1 ^ ^ 

1 " 



4=1 



1 " 

- l{i?f > 77 + 1 - fcx, i?f > 77 + 1 - fcy}. 



RUx,y) ■.= r^[U,<—,V,<^ 



77 77 



Further, note that 



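Write $v_n(x,y) = \sqrt{k}\,(T_n(x,y) - R_n(x,y))$, $v_{n,1}(x) := v_n(x,\infty)$ and $v_{n,2}(y) := v_n(\infty,y)$. 
From [7], Proposition 3.1, we get 

$$(v_n(x,y),\ x,y \in [0,1];\ v_{n,1}(x),\ x \in [0,1];\ v_{n,2}(y),\ y \in [0,1]) \xrightarrow{d} (W(x,y),\ x,y \in [0,1];\ W_1(x),\ x \in [0,1];\ W_2(y),\ y \in [0,1]),$$

in the topology of uniform convergence, as $n \to \infty$. Invoking the Skorokhod construction 
(see, e.g., [27]), we get a new probability space containing all $\tilde v_n, \tilde v_{n,1}, \tilde v_{n,2}, \tilde W, \tilde W_1, \tilde W_2$ for 
which it holds that 

$$(\tilde v_n, \tilde v_{n,1}, \tilde v_{n,2}) \stackrel{d}{=} (v_n, v_{n,1}, v_{n,2}), \qquad (\tilde W, \tilde W_1, \tilde W_2) \stackrel{d}{=} (W, W_1, W_2)$$

as well as 

$$\sup_{0 \le x,y \le 1} |\tilde v_n(x,y) - \tilde W(x,y)| \xrightarrow{a.s.} 0, \qquad \sup_{0 \le x \le 1} |\tilde v_{n,j}(x) - \tilde W_j(x)| \xrightarrow{a.s.} 0, \quad j = 1, 2.$$

We will work on this space from now on, but keep the old notation (without tildes). The 
following consequence of the above and Vervaat's lemma [28] will be useful: 

$$\sup_{0 \le x \le 1} \bigl|\sqrt{k}\,(S_{jn}(x) - x) + W_j(x)\bigr| \xrightarrow{a.s.} 0, \quad j = 1, 2. \tag{7.2}$$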



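Proof of Theorem 4.2. In this proof, we will write $l(x,y)$ and $R(x,y)$ instead of 
$l(x,y;\theta_0)$ and $R(x,y;\theta_0)$, respectively. 
First, we will show that as $n \to \infty$, 

$$\Bigl|\sqrt{k}\Bigl(\iint_{[0,1]^2} g(x,y)\,\hat L_n^1(x,y)\,dx\,dy - \varphi(\theta_0)\Bigr) + \tilde B\Bigr| \xrightarrow{P} 0. \tag{7.3}$$

Since, for each $x, y \in (0,1]$, 

$$(\hat L_n^1 + \hat R_n^1)(x,y) = \frac{\lceil kx \rceil + \lceil ky \rceil - 2}{k}$$

almost surely, from 

$$\Bigl|\frac{\lceil kx \rceil + \lceil ky \rceil - 2}{k} - x - y\Bigr| \le \frac{2}{k}$$

it follows that 

$$\begin{aligned}
&\Bigl|\sqrt{k}\Bigl(\iint_{[0,1]^2} g(x,y)\,\hat L_n^1(x,y)\,dx\,dy - \iint_{[0,1]^2} g(x,y)\,l(x,y)\,dx\,dy\Bigr) \\
&\qquad + \sqrt{k}\Bigl(\iint_{[0,1]^2} g(x,y)\,\hat R_n^1(x,y)\,dx\,dy - \iint_{[0,1]^2} g(x,y)\,R(x,y)\,dx\,dy\Bigr)\Bigr| \\
&\quad \le \iint_{[0,1]^2} |g(x,y)|\,\sqrt{k}\,\Bigl|\frac{\lceil kx \rceil + \lceil ky \rceil - 2}{k} - x - y\Bigr|\,dx\,dy \le \frac{2}{\sqrt{k}} \iint_{[0,1]^2} |g(x,y)|\,dx\,dy
\end{aligned}$$

almost surely. Hence, to show (7.3), we will prove 

$$\Bigl|\iint_{[0,1]^2} g(x,y)\,\sqrt{k}\,(\hat R_n^1(x,y) - R(x,y))\,dx\,dy - \tilde B\Bigr| \xrightarrow{P} 0. \tag{7.4}$$

First, we write 

$$\begin{aligned}
\sqrt{k}\,(\hat R_n^1(x,y) - R(x,y)) &= \sqrt{k}\,(\hat R_n^1(x,y) - R_n(S_{1n}(x), S_{2n}(y))) \\
&\quad + \sqrt{k}\,(R_n(S_{1n}(x), S_{2n}(y)) - R(S_{1n}(x), S_{2n}(y))) \\
&\quad + \sqrt{k}\,(R(S_{1n}(x), S_{2n}(y)) - R(x,y)).
\end{aligned}$$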
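From the assumption on integrability of $g$ and the proof of [7], Theorem 2.2, page 2003, 
we get 

$$\begin{aligned}
&\iint_{[0,1]^2} |g(x,y)|\,\bigl|\sqrt{k}\,(\hat R_n^1(x,y) - R_n(S_{1n}(x), S_{2n}(y))) - W(x,y)\bigr|\,dx\,dy \\
&\quad \le \sup_{0 \le x,y \le 1} \bigl|\sqrt{k}\,(\hat R_n^1(x,y) - R_n(S_{1n}(x), S_{2n}(y))) - W(x,y)\bigr| \iint_{[0,1]^2} |g(x,y)|\,dx\,dy \xrightarrow{P} 0
\end{aligned} \tag{7.5}$$

and, by conditions (C2) and (C3), 

$$\begin{aligned}
&\iint_{[0,1]^2} |g(x,y)|\,\bigl|\sqrt{k}\,(R_n(S_{1n}(x), S_{2n}(y)) - R(S_{1n}(x), S_{2n}(y)))\bigr|\,dx\,dy \\
&\quad \le \sup_{0 \le x,y \le 1} \bigl|\sqrt{k}\,(R_n(S_{1n}(x), S_{2n}(y)) - R(S_{1n}(x), S_{2n}(y)))\bigr| \iint_{[0,1]^2} |g(x,y)|\,dx\,dy \xrightarrow{P} 0.
\end{aligned} \tag{7.6}$$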

Take $\omega$ in the Skorokhod probability space introduced above such that $\sup_{0 \le x \le 1} |W_1(x)|$ 
and $\sup_{0 \le y \le 1} |W_2(y)|$ are finite and (7.2) holds. For such $\omega$, we will show, by means of 
dominated convergence, that 

$$\iint_{[0,1]^2} |g(x,y)|\,\bigl|\sqrt{k}\,(R(S_{1n}(x), S_{2n}(y)) - R(x,y)) + R_1(x,y)W_1(x) + R_2(x,y)W_2(y)\bigr|\,dx\,dy \to 0. \tag{7.7}$$

(i) Pointwise convergence of the integrand to zero for almost all $(x,y) \in [0,1]^2$. 
Convergence in $(x,y)$ follows from (7.2), provided $R(x,y)$ is differentiable. The set of points 
in which this might fail is, by Lemma 7.1, equal to 

$$D_R := \Bigl\{(x,y) \in [0,1]^2 : H(\{z\}) > 0,\ z = \frac{y}{x+y}\Bigr\}.$$

Since $H$ is a finite measure, there can be at most countably many $z$ for which $H(\{z\}) > 0$. 
The set $D_R$ is then a union of at most countably many lines through the origin and hence 
has Lebesgue measure zero. 

(ii) The domination of the integrand for all $(x,y) \in [0,1]^2$. Comparing (7.1) and the 
moment conditions (2.4), we see that for all $(x,y) \in [0,1]^2$, it holds that $|R_1(x,y)| \le 1$ 



1022 J.H.J. Einmahl, A. Krajina and J. Segers 

and \R2{x,y)\ < 1. Hence, for all {x,y) G [0,1]^, 

\g[x,y)\\Vk{R{Sin{x),S2n{y)) - R{x,y)) + Ri{x,y)Wi{x) + R2{x,y)W2{v)\ 

< \gix,y)\{Vk\R{Sin{x),S2niy)) - Rix,y)\ + + |W^2(y)|). 

We will show that the right-hand side in the above inequality is less than or equal to $M |g(x,y)|$ for all $(x,y) \in [0,1]^2$ and some positive constant $M$ (depending on $\omega$). For that purpose, we prove that

\[
\sup_{0 < x, y \le 1} \sqrt{k}\,|R(S_{1n}(x), S_{2n}(y)) - R(x,y)| = O(1).
\]

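The representation (2.1) implies that for all $x, x_1, x_2, y, y_1, y_2 \in [0,1]$,

\[
|R(x_1,y) - R(x_2,y)| \le |x_1 - x_2|, \qquad |R(x,y_1) - R(x,y_2)| \le |y_1 - y_2|.
\]

By these inequalities and (7.2), we now have

\[
\begin{aligned}
\sup_{0 < x, y \le 1} \sqrt{k}\,|R(S_{1n}(x), S_{2n}(y)) - R(x,y)|
&\le \sup_{0 < x, y \le 1} \sqrt{k}\,|R(S_{1n}(x), S_{2n}(y)) - R(S_{1n}(x), y)| + \sup_{0 < x, y \le 1} \sqrt{k}\,|R(S_{1n}(x), y) - R(x,y)| \\
&\le \sup_{0 < x \le 1} \sqrt{k}\,|S_{1n}(x) - x| + \sup_{0 < y \le 1} \sqrt{k}\,|S_{2n}(y) - y| \\
&= O(1).
\end{aligned}
\]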
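For completeness, here is one way to see the Lipschitz property used in this bound. The sketch assumes that $R$ admits the spectral representation $R(x,y) = \int_{[0,1]} \min\{wx, (1-w)y\}\,H(dw)$ with $\int_{[0,1]} w\,H(dw) = 1$ as in the moment conditions (2.4), and uses the elementary inequality $|\min\{a,c\} - \min\{b,c\}| \le |a - b|$.

```latex
\[
\begin{aligned}
|R(x_1,y) - R(x_2,y)|
&\le \int_{[0,1]} \bigl|\min\{w x_1, (1-w) y\} - \min\{w x_2, (1-w) y\}\bigr|\,H(dw) \\
&\le |x_1 - x_2| \int_{[0,1]} w\,H(dw) = |x_1 - x_2| .
\end{aligned}
\]
```

The inequality in the second argument follows in the same way from $\int_{[0,1]} (1-w)\,H(dw) = 1$.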

Recalling that $\sup_{0 < x \le 1} |W_1(x)|$ and $\sup_{0 < y \le 1} |W_2(y)|$ are finite completes the proof of domination and hence the proof of (7.7).

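Combining (7.5), (7.6) and (7.7), we get (7.4) and therefore also (7.3). Property (3.1) provides us with a statement analogous to (7.3), but with $\hat R_n^1$ replaced by $\hat l_n$. That is, we have

\[
\Bigl| \sqrt{k}\,\Bigl( \iint_{[0,1]^2} g(x,y)\,\hat l_n(x,y)\,dx\,dy - \varphi(\theta_0) \Bigr) + \tilde B \Bigr| \xrightarrow{P} 0.
\tag{7.8}
\]

Using condition (C1) and the inverse mapping theorem, we get that $\varphi^{-1}$ is continuously differentiable in a neighborhood of $\varphi(\theta_0)$ and $D\varphi^{-1}(\varphi(\theta_0))$ is equal to $D\varphi(\theta_0)^{-1}$. By a routine argument, using the delta method (see, e.g., Theorem 3.1 in [26]), (7.8) implies that

\[
\sqrt{k}\,(\hat\theta_n - \theta_0) \xrightarrow{d} -D\varphi(\theta_0)^{-1} \tilde B
\]

and since $\tilde B$ is mean-zero normally distributed ($\tilde B \stackrel{d}{=} -\tilde B$),

\[
\sqrt{k}\,(\hat\theta_n - \theta_0) \xrightarrow{d} D\varphi(\theta_0)^{-1} \tilde B. \qquad \square
\]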
Lemma 7.2. Let $H_\theta$ be the spectral measure and $\Sigma(\theta)$ the covariance matrix in (4.2). If the mapping $\theta \mapsto H_\theta$ is weakly continuous at $\theta_0$, then $\theta \mapsto \Sigma(\theta)$ is continuous at $\theta_0$.

Proof. Let $\theta_n \to \theta_0$. In view of the expression for $\Sigma(\theta)$ in (4.2) and (4.3), the assumption that $g$ is integrable and the fact that $R$, $|R_1|$ and $|R_2|$ are bounded by 1 for all $\theta$ and $(x,y) \in [0,1]^2$, it suffices to show that $R(x,y;\theta_n) \to R(x,y;\theta_0)$ and $R_i(x,y;\theta_n) \to R_i(x,y;\theta_0)$ for $i = 1, 2$ and for almost all $(x,y) \in [0,1]^2$.

Convergence of $R$ for all $(x,y) \in [0,1]^2$ follows directly from the representation of $R$ in terms of $H$ in (2.3) and the definition of weak convergence. Convergence of $R_1$ and $R_2$ in the points $(x,y) \in (0,1]^2$ for which $H_{\theta_0}(\{y/(x+y)\}) = 0$ follows from Lemma 7.1; see, for instance, [3], Theorem 5.2(iii) (note that by the moment constraints (2.4), $H_\theta/2$ is a probability measure). Since $H_{\theta_0}$ can have at most countably many atoms, $R_1$ and $R_2$ converge in all $(x,y) \in (0,1]^2$, except for at most countably many rays through the origin. □

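Proof of Corollary 4.3. By the continuous mapping theorem, it suffices to show that

\[
\Sigma(\hat\theta_n)^{-1/2}\, D\varphi(\hat\theta_n)\, \sqrt{k}\,(\hat\theta_n - \theta_0) \xrightarrow{d} N(0, I_p)
\]

with $I_p$ being the $p \times p$ identity matrix. By condition (C1) of Theorem 4.2, the map $\theta \mapsto D\varphi(\theta)$ is continuous at $\theta_0$ so that, by the continuous mapping theorem, $D\varphi(\hat\theta_n) \xrightarrow{P} D\varphi(\theta_0)$ as $n \to \infty$. Slutsky's lemma and (4.1) yield

\[
D\varphi(\hat\theta_n)\, \sqrt{k}\,(\hat\theta_n - \theta_0) \xrightarrow{d} D\varphi(\theta_0)\, D\varphi(\theta_0)^{-1} \tilde B = \tilde B
\]

as $n \to \infty$. By Lemma 7.2 and the assumption that the map $\theta \mapsto H_\theta$ is weakly continuous, $\Sigma(\hat\theta_n)^{-1/2} \xrightarrow{P} \Sigma(\theta_0)^{-1/2}$. Applying Slutsky's lemma once more concludes the proof. □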
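As an illustration of how this standardized limit could be used in the scalar case $p = 1$, the sketch below turns plug-in values of $D\varphi(\hat\theta_n)$ and $\Sigma(\hat\theta_n)$ into an asymptotic confidence interval. The function name and all numerical inputs are hypothetical; computing $D\varphi$ and $\Sigma$ for a given model is not covered here.

```python
from math import sqrt
from statistics import NormalDist
from typing import Tuple


def wald_interval(theta_hat: float, dphi_hat: float, sigma2_hat: float,
                  k: int, level: float = 0.95) -> Tuple[float, float]:
    """Asymptotic confidence interval for a scalar parameter.

    Uses the approximation
        sigma2_hat**-0.5 * dphi_hat * sqrt(k) * (theta_hat - theta_0) ~ N(0, 1),
    where dphi_hat and sigma2_hat are plug-in values at theta_hat.
    """
    if dphi_hat == 0.0 or sigma2_hat <= 0.0 or k <= 0:
        raise ValueError("need dphi_hat != 0, sigma2_hat > 0 and k > 0")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # standard normal quantile
    half_width = z * sqrt(sigma2_hat) / (abs(dphi_hat) * sqrt(k))
    return theta_hat - half_width, theta_hat + half_width


# Hypothetical plug-in values, for illustration only.
lo, hi = wald_interval(theta_hat=0.6, dphi_hat=-0.4, sigma2_hat=0.09, k=100)
print(lo, hi)
```

Only the absolute value of the derivative matters for the interval width, so the sign of `dphi_hat` is irrelevant; for $p > 1$ the same limit leads to an ellipsoidal region based on a chi-square quantile.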

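Proof of Theorem 4.4. We will show that for the Skorokhod construction introduced before the proof of Theorem 4.2,

\[
\Bigl| \iint_{[0,1]^2} k\,\bigl(\hat l_n(x,y) - l(x,y;\hat\theta_n)\bigr)^2\,dx\,dy
- \iint_{[0,1]^2} \bigl(B(x,y) - D_{l(x,y;\theta)}(\theta_{H_0})\, D\varphi(\theta_{H_0})^{-1} \tilde B\bigr)^2\,dx\,dy \Bigr| \xrightarrow{P} 0
\]

as $n \to \infty$. The left-hand side of the previous expression is less than or equal to

\[
\begin{aligned}
&\sup_{0 < x, y \le 1} \bigl|\sqrt{k}\,\bigl(\hat l_n(x,y) - l(x,y;\hat\theta_n)\bigr) - B(x,y) + D_{l(x,y;\theta)}(\theta_{H_0})\, D\varphi(\theta_{H_0})^{-1} \tilde B\bigr| \\
&\quad \times \Bigl( \iint_{[0,1]^2} \bigl|\sqrt{k}\,\bigl(\hat l_n(x,y) - l(x,y;\theta_{H_0})\bigr) + B(x,y)\bigr|\,dx\,dy \\
&\qquad\quad + \iint_{[0,1]^2} \bigl|\sqrt{k}\,\bigl(l(x,y;\theta_{H_0}) - l(x,y;\hat\theta_n)\bigr) - D_{l(x,y;\theta)}(\theta_{H_0})\, D\varphi(\theta_{H_0})^{-1} \tilde B\bigr|\,dx\,dy \Bigr) \\
&=: S\,(I_1 + I_2).
\end{aligned}
\]

From (7.8) with $g \equiv 1 \in \mathbb{R}$, we have $I_1 \xrightarrow{P} 0$. We need to prove that $S = O_P(1)$ and $I_2 = o_P(1)$.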
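The bound by $S\,(I_1 + I_2)$ can be read as an instance of the identity $a^2 - b^2 = (a - b)(a + b)$. The following sketch spells this out; the shorthand $a_n$ and $b$ is introduced here only for illustration and is not notation from the paper.

```latex
\[
a_n(x,y) := \sqrt{k}\,\bigl(\hat l_n(x,y) - l(x,y;\hat\theta_n)\bigr), \qquad
b(x,y) := B(x,y) - D_{l(x,y;\theta)}(\theta_{H_0})\, D\varphi(\theta_{H_0})^{-1} \tilde B,
\]
\[
\Bigl| \iint_{[0,1]^2} (a_n^2 - b^2)\,dx\,dy \Bigr|
\le \sup_{0 < x, y \le 1} |a_n - b| \iint_{[0,1]^2} |a_n + b|\,dx\,dy ,
\]
\[
a_n + b = \bigl[\sqrt{k}\,(\hat l_n - l(\cdot\,;\theta_{H_0})) + B\bigr]
+ \bigl[\sqrt{k}\,(l(\cdot\,;\theta_{H_0}) - l(\cdot\,;\hat\theta_n)) - D_{l(x,y;\theta)}(\theta_{H_0})\, D\varphi(\theta_{H_0})^{-1} \tilde B\bigr].
\]
```

Applying the triangle inequality to the last decomposition gives $\iint |a_n + b|\,dx\,dy \le I_1 + I_2$, while the supremum is $S$.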

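Proof of $S = O_P(1)$. We have

\[
\begin{aligned}
S &\le \sup_{0 < x, y \le 1} |B(x,y)| + \sup_{0 < x, y \le 1} \bigl|\sqrt{k}\,\bigl(\hat l_n(x,y) - l(x,y;\theta_{H_0})\bigr)\bigr| \\
&\quad + \sup_{0 < x, y \le 1} \bigl|\sqrt{k}\,\bigl(l(x,y;\theta_{H_0}) - l(x,y;\hat\theta_n)\bigr) + D_{l(x,y;\theta)}(\theta_{H_0})\, D\varphi(\theta_{H_0})^{-1} \tilde B\bigr| \\
&=: \sup_{0 < x, y \le 1} |B(x,y)| + S_1 + S_2.
\end{aligned}
\]

From the definition of the process $B$, it follows that $\sup_{0 < x, y \le 1} |B(x,y)|$ is almost surely bounded. Furthermore, we have

\[
\begin{aligned}
S_1 &= \sup_{0 < x, y \le 1} \bigl|\sqrt{k}\,\bigl(\hat R_n^1(x,y) - R(x,y;\theta_{H_0})\bigr)\bigr| + o(1) \\
&\le \sup_{0 < x, y \le 1} \bigl|\sqrt{k}\,\bigl(\hat R_n^1(x,y) - R_n(S_{1n}(x), S_{2n}(y))\bigr)\bigr| \\
&\quad + \sup_{0 < x, y \le 1} \bigl|\sqrt{k}\,\bigl(R_n(S_{1n}(x), S_{2n}(y)) - R(S_{1n}(x), S_{2n}(y);\theta_{H_0})\bigr)\bigr| \\
&\quad + \sup_{0 < x, y \le 1} \bigl|\sqrt{k}\,\bigl(R(S_{1n}(x), S_{2n}(y);\theta_{H_0}) - R(x,y;\theta_{H_0})\bigr)\bigr| + o(1)
\end{aligned}
\]

almost surely. In the last part of the proof of Theorem 4.2, we have shown that the third term is almost surely bounded and, by the proof of [7], Theorem 2.2, we know that the first two terms are bounded in probability. Let $M$ denote a constant (depending on $\theta_{H_0}$) bounding the gradient of $\theta \mapsto l(x,y;\theta)$ at $\theta_{H_0}$ in $(x,y) \in [0,1]^2$. Then, by (4.1),

\[
S_2 \le M\,\|\sqrt{k}\,(\hat\theta_n - \theta_{H_0})\| + M\,\|D\varphi(\theta_{H_0})^{-1} \tilde B\| = O_P(1).
\]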
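Proof of $I_2 = o_P(1)$. In Theorem 4.2, we have shown that

\[
T_n := \sqrt{k}\,(\hat\theta_n - \theta_{H_0}) \xrightarrow{d} -D\varphi(\theta_{H_0})^{-1} \tilde B =: N.
\]

By Slutsky's lemma, it is also true that $(T_n, N) \xrightarrow{d} (N, N)$. By the Skorokhod construction, there exists a probability space, call it $\Omega^*$, which contains both $T_n^*$ and $N^*$, where $(T_n^*, N^*) \stackrel{d}{=} (T_n, N)$ and

\[
(T_n^*, N^*) \xrightarrow{\text{a.s.}} (N^*, N^*).
\tag{7.9}
\]

Set $\hat\theta_n^* := T_n^*/\sqrt{k} + \theta_{H_0} \stackrel{d}{=} T_n/\sqrt{k} + \theta_{H_0} = \hat\theta_n$. Let $\Omega_0^* \subset \Omega^*$ be a set of probability 1 on which $N^*$ is finite and the convergence in (7.9) holds. We will show that on $\Omega_0^*$,

\[
I_2^* := \iint_{[0,1]^2} X_n^*(x,y)\,dx\,dy
:= \iint_{[0,1]^2} \bigl|\sqrt{k}\,\bigl(l(x,y;\hat\theta_n^*) - l(x,y;\theta_{H_0})\bigr) - D_{l(x,y;\theta)}(\theta_{H_0})\, N^*\bigr|\,dx\,dy
\]

converges to zero. Since $I_2^* \stackrel{d}{=} I_2$, the above convergence (namely $I_2^* \xrightarrow{\text{a.s.}} 0$) will imply that $I_2 \xrightarrow{P} 0$. To show that $I_2^*$ converges to zero on $\Omega_0^*$, we will once more apply the dominated convergence theorem. Hereafter, we work on $\Omega_0^*$.

(i) Pointwise convergence of $X_n^*(x,y)$ to zero. We have that

\[
\begin{aligned}
X_n^*(x,y) &\le \bigl|\sqrt{k}\,\bigl(l(x,y;\hat\theta_n^*) - l(x,y;\theta_{H_0})\bigr) - D_{l(x,y;\theta)}(\theta_{H_0})\,\sqrt{k}\,(\hat\theta_n^* - \theta_{H_0})\bigr| \\
&\quad + \bigl|D_{l(x,y;\theta)}(\theta_{H_0})\,(T_n^* - N^*)\bigr|.
\end{aligned}
\]

Because of (7.9), differentiability of $\theta \mapsto l(x,y;\theta)$ and continuity of matrix multiplication, the right-hand side of the above inequality converges to zero for all $(x,y) \in [0,1]^2$.

(ii) Domination of $X_n^*(x,y)$. Let $M$ be as above. Since the sequence $(T_n^*) = (\sqrt{k}\,(\hat\theta_n^* - \theta_{H_0}))$ is convergent, and hence bounded, we have

\[
\sup_{0 < x, y \le 1} X_n^*(x,y) \le M\,\|\sqrt{k}\,(\hat\theta_n^* - \theta_{H_0})\| + M\,\|N^*\| = O(1).
\]

This concludes the proof of domination and hence the proof of $I_2 \xrightarrow{P} 0$. □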
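The quantity whose limit is studied in this proof, $\iint_{[0,1]^2} k\,(\hat l_n - l(\cdot\,;\hat\theta_n))^2\,dx\,dy$, can be approximated numerically once both functions are available. The sketch below uses a plain midpoint rule; the function name, the grid size and the two toy functions are hypothetical stand-ins, not the estimators of the paper.

```python
from typing import Callable

TailFn = Callable[[float, float], float]


def gof_statistic(l_hat: TailFn, l_model: TailFn, k: int, m: int = 50) -> float:
    """Midpoint-rule approximation of
    k * integral over [0,1]^2 of (l_hat(x, y) - l_model(x, y))**2 dx dy."""
    step = 1.0 / m
    total = 0.0
    for i in range(m):
        x = (i + 0.5) * step
        for j in range(m):
            y = (j + 0.5) * step
            diff = l_hat(x, y) - l_model(x, y)
            total += diff * diff
    return k * total * step * step


# Toy inputs: a fitted parametric function and a perturbed "estimate".
def l_model(x: float, y: float) -> float:
    return max(x, y)  # complete dependence, used only as an example


def l_hat(x: float, y: float) -> float:
    return l_model(x, y) + 0.01  # constant offset, for illustration


stat = gof_statistic(l_hat, l_model, k=100)
print(stat)
```

With a constant offset $c$ between the two functions the statistic equals $k c^2$, which gives a quick sanity check of the quadrature; critical values would have to come from the limiting distribution, which this sketch does not compute.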



Proof of Lemma 6.1. Without loss of generality, we can assume that F is itself a 
parallel elliptical distribution, that is, {X,Y) is given as in (6.1) with p = in (6.2). 
Under the assumptions of the lemma and by [18], Theorem 2.3, there exists a function $h : [0,\infty)^2 \to \mathbb{R}$ such that as $t \downarrow 0$ and for all $(x,y) \in [0,\infty)^2$,

\[
\frac{t^{-1}\, P\{1 - F_1(X) \le tx,\ 1 - F_2(Y) \le ty\} - R(x,y;\nu)}{A(F_2^{\leftarrow}(1 - t))} \to h(x,y).
\tag{7.10}
\]

Moreover, the convergence in (7.10) holds uniformly on $\{(x,y) \in [0,\infty)^2 : x^2 + y^2 = 1\}$ and the function $h$ is bounded on that region; see [18] for an explicit expression of the function $h$.

Condition (6.4) obviously implies that $z \mapsto P(Z > z)$ is regularly varying at infinity with index $-\nu$. Hence, the same is true for the function $1 - F_2$; see [16]. By [2], Proposition 1.5.7 and Theorem 1.5.12, the function $x \mapsto |A(F_2^{\leftarrow}(1 - 1/x))|$ is regularly varying at infinity with index $\beta/\nu$. Hence, for every $\alpha < -\beta/\nu$, we have $A(F_2^{\leftarrow}(1 - 1/x)) = o(x^{-\alpha})$ as $x \to \infty$ or $A(F_2^{\leftarrow}(1 - t)) = o(t^{\alpha})$ as $t \downarrow 0$. As a consequence, for every $\alpha < -\beta/\nu$, we have, as $t \downarrow 0$,

\[
t^{-1}\, P\{1 - F_1(X) \le tx,\ 1 - F_2(Y) \le ty\} - R(x,y;\nu) = O(t^{\alpha}),
\]

uniformly on $\{(x,y) \in [0,\infty)^2 : x^2 + y^2 = 1\}$. Uniformity on $\{(x,y) \in [0,\infty)^2 : x + y = 1\}$ now follows as in the proof of [7], Theorem 2.2. □